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ABSTRACT 

Langmuir's model is studied for the situation where 
epsilon IS independently and identically normally distributed. The 
"Y/x" versus "Y" plot had a 90% mid-range that did not contain the 
true curve in a vast portion of the range of "x". The "l/Y" versus 
"1/chi" plot had undefined expecced values, and this problem worsens 
as sample size increases. The use of non-linear least squares is 
recommended. In non-linear regression, it is demonstrated that a 
design that gives at least a local minimum of the generalized 
variance of the parameter estimators is one where half the 
observations are taken at the maximal value of "x" (termed "x-theta") 
and half of the epsilons equal half their maximal value. The use of 
such an extreme design is optimal in simple linear regression; 
^owever, it is curious that this design is optimal in non-linear 
Langmuir's model. Six graphs, five data tables, and numerous 
equations are provided. (Author/SLD) 
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Abstract 



We study Langmuir's model, Y = -r 6, where e is independently 
and identically normally distributed, iV(0, <7^). The Y/x versus Y plot had 
a 90% midrange which did not contain the true curve in a vast portion of 
the range of x. The l/Y versus l/x plot has undefined expected values, and 
gets worse as the sample size increases. We recommend the use of nonlinear 
least squares. 

In nonlinear regression, we prove that a design which gives at least 
a local minimum of the generalized variance of the parameter estimators 
is one where half the observations are at the maximum value of x, called 
x^, and the remainder at x = 2+bxoo ' ^^^^ ^ extreme design 

ic optimal in simple linear regression. It is curious that it is optimal in 
nonlinear Langmuir's model. 
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I. Introduction 

A. BASIC MATERIAL: The Langmuir model or isotherm is defined by 



where x is the control variable, free from error. The response variable Y, 
is measured with error e, which is not a function of x. We assume that the 
errors e, , i = 1, ... , n are i.i.d. N{0, a^) both for the sake of simplicity 
and to go along with the literature. Our objective is to estimate a and b. 

Chemical engineers and biologists who wish to estimate a and b com- 
monly avoid minimizing this sum of squares: 




(1) 



Instead, they minimize some other sum of squares, such as the following: 
Linearization I: 




(2) 



Linearization II, Scatchard: 




or plot — - 

X 



versus Y 



(3) 



Linearization III, Lineweaver-Burk: 




or plot — versus — 
Y X 



(4) 



Each transformation to linearize the model, or linearization, is obtained 
by modifying Y ~ ax/ {I + 6x) without regard to e. Presumably, nonlinear 



models are linearized not because of the structure of the error, but to avoid 
difficulties in numerically obtaining nonlinear least square estimates, and 
permit graphing a straight line. The question which we raise is: what is the 
effect of the error term on such a linearization and on the inference made? 

We investigate these methods of estimating the parameters, for the 
usual design of the control variable, x. Our conclusion, along with that of 
Colquhoun^ is that the linearization does not work! On the other hand, 
nonlinear estimation poses no particular problem when one has access to a 
computer. We also develop designs of x values to obtain better parameter 
estimates, under the assumption that the Langmuir's model is true* 

B. DERIVATION OF THE MODEL: Langmuir^ derived his isotherms in 
1916. We use chemistry notation to derive the model. A compound of 
components A and S is written AS. The concentration of a compound A is 
written [A], using the brackets to denote the concentration measure. 

For one mobile phase compound .4, and stationary phase compound 5, 
the reaction A + S ^ AS has the reaction equilibrium constant 



By the conservation of mass, the number of components of S is some 
constant c, where [S] + [AS] = c. So 



[A]{c-[AS]y 
and = [A5] + fc[A][A5], and 

ck[A] 



[AS] = 



1 + klA] 



To translate the model into the usual notation of statistics, define 
a = ck, b = Y = [AS], and x = [.4], so S{Y) = The model can 

be derived for Michaelis-Menten reactions similarly, shown in many text- 
books. 



Doug Doren, a chemist at the University of Delaware, knew that one 
can optimize over parameter estimation by taking equally many observa- 
tions as far apart as possible after linearization. Based on the derivation of 
the Michaelis-Menten, we suspect that some people have guessed our opti- 
mal design. It seems that no one has shown its optimality using analytical 
methods before us. So our claim to fame is in proving something which the 
best minds have only suspected. 

C. NONLINEAR ESTIMATION: In general, one finds the simultaneous 
zero of the partial derivatives of to minimize in equation (1): = 0 

and 1^ = 0. 

In our problem, we found a transformation of these equations helpful. 
Since the model is linear in parameter a, we can solve in closed form to 
obtain a quantity h(h) such that a - h(h) : 

Next, we transform the second equation, to an equation of the form 

r ' 4- — ^ = 0 
da* ^ db* 

We take the approach of taking the derivative of X'lHb*), b*] with respect 
to 6* to obtain: 

d6* dh{b*) db* ~ 

Define z,- = to simplify the expression. So, the derivative of h{b) 
with respect to 6 is 



4 

5 
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We complete the evaluation of the formula to obtain the formula used 
on the computer: 



9(b) =- E 



i=l 



1 + 6x, 



1 + 6x, 



h'{b) - 



h{b)Xi 

1 6X; 



= 0 



We solve this equation for b to obtain a = k{b), without checking 

= 0, since we have already solved that problem analytically. 
We use bisection, or binary search to find 6*, where 6* is such that 
5'(6*) ~ 0. We define the sign function as follows: 

f 1 iig>0 
sgn(/) = { 0 ify = 0 
l-l ify<0 

We first find values 6j, 63 such that the signs axe different, 
sgnb(^)] 7^ sgn[y(63)]. Next, assign 63 ^ If sgnb(6i)] = sgn[y(62)], 

we assign 6j <— 62. Otherwise, if the solution is not in hand, we assign 
^3 ^ ^2 • process is repeated till we know the least squares 6 to enough 
digits. Bisection gave reasonable answers for 32 000 simulations. Hence 
this method is recommended to obtain the nonlinear least 
squares for Langmuir's model. We do not need the general purpose non- 
linear estimation schemes such as the maximum descent, along with the 
transformations of Habibullah^. For the two and more solute extensions 
of Langmuir's model, general purpose numerical minimizations might be 
necessary. 



II. Simulation Ba.sed Results 1 

A. LINEAR III W^ORSE AS SAMPLE INCREASES: For each simulation 
in this section, we use the design (l/n, 2/n, ... , n/n). For each x,, we 
have an expected value of 1^^, that is ^^^j^^ , for x^, Xo, . . . , x„. So 

Y ■ = [• € ■■ 



Er|c 6 



for / = 1, 2, ... , n, and i = 1, 2, ... , 2000. The e,-^- come from a normal 
random number generator, with mean zero and standard deviation 0.2. For 
each sample (l^-y, x-^), i = 1, ... , n, we estimate (a, 6). So we have 2000 
estimates of (a, 6) for each case considered. We used a 25 and 6 = 10, 
the usual parameter estimates (rounded to integer values) given by Anita 
Katti, from her previous work at Oak Ridge National Labs. 

Note that at x = 0, S(Y) = 0. So, the relative error is infinite. At x = 
oo, the relative error is 8%. Since the expected value of Vj- is proportional to 
a, and we are using least squares, root n asymptotics apply. So we expect 
similar results for other parameterizations with our values oi (T/{n^f^a). We 
chose a combination of standard dev- ^tion and sample size to illustrate our 
results. 

The "range" of estimates a is the largest of the 2000 simulated esti- 
mates less the smallest. The Pr(6 < 0) is the probability observed in the 
simulation. 



Table 1 


Estimates 


a, with a = 25, 


(7 =.2. 


size: 


10 


40 


160 


640 


III range: 


90 


73 


95 338 


131 000 


NL range: 


81 


25 


12 


7 


III min: 


7 


16 


-12 


-46 460 


NL min: 


12 


17 


20 


22 


Pr(6 < 0) : 


0 : 


1% 


31.8% 


39.7% 



Table 1 shows the behavior of methods III and NL for increasing sample 
sizes, picking x = {^, f, f, J}. 

Since the density of Y is continuous and nonzero at zero, and lineariza- 
tion III contains -p, the theoretical moments are infinite or undefined. This 

shows up as the vast range 131 000 for a when method III is used, while 
the range for the nonlinear decreases to 7. Thus, the probabilities that gave 
reasonable midrangtd for method III and small sample sizes are less helpful 
for larger sample sizes. 

Figure 1 contains a graph of Y = We have 2000 graphs. To 

reduce volume, we take a point x on the x-axis. We compute all of the 



2000 r, and rank them. We throw away the lowest and the highest 5% 
or 100 of the K-values to obtain an interval which contains the remaining 
90%. We plot the 101st and 1900th points. We call them the .90 midrange. 
Linearization III has much larger midranges than nonlinear. 



Figure 1 is a comparison between linearization "III" and nonlinear, 
"NL" least squares. The sample sizes are 40 and 640. The true curve 
is S{Y) = 25x/(l + lOx), and is labeled "true." The midrange curves 
are labeled "III" for estimation using lineEirization "III". The midrange 
curves for nonlinear least squares are labeled "NL". The standard deviation, 
(7 = Var^/^(6), is 0.2 in the 2000 simulated samples used. 

For the cases of n=40 and n=640 values of x, located at ^, 
i = 1, ... , /I, we see that the nonlinear least squares midrange is nearly 
symmetric about the true curve and closer to the true curve than the lin- 
earization III midranges for the vast majority of values of x. Linearization 
III given in Figure 3, part B with sample size 640 is distinctly worse than 
the curve in part A for a sample size of only 40. This is due to the reciprocal. 
For n=640, there are more observations near x=0 than for n=40. 



Linear III versus Nonlinear .90 MidRcinges 




41 



A. Sample Size n=40 
Fig. 1. 



B. Sample Size n=640 
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At sample size 640, about 40% of the estimates of b are negative for 
linearization III, given in table 1. These negative estimates lack physical 
meaning, and yield the negative midrange curve values for x < .1, for 
linearization III. 

The loss of accuracy with increasing data, shown in table 1 and figure 
3, is not acceptable. So, we set aside linearization III. 

B. FURTHER COMPARISONS BY SIMULATION: Next , we compare es- 
timates obtained through least squares on linearizations I and II with non- 
linear least squares. We chose the following thirty values of x. 

(x,) = (.1, .1, .1, .2, .2, .2, .3, .9, 1, 1, 1). 




Figure 2 contains graphs of the midranges for the methods Lineariza- 
tion I, Linearization II, and NonLinear (NL) estimation. Circles are drawn 
on the true curve, S{Y) = 25x/(l + lOx), to indicate the locations of the 
observations. The midrange curves for nonlinear regression are in dashes. 

ERIC o 
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Vv'e use F'igure 2 to compaxe linearizatious "F and "11" with non- 
linear least Sijuares. The standard deviation is 0.2, and the design is 
(x,) = (.1, .1, .1, .2, .2, .2, .3, ... 1). The midrange curves for nonlinear 
least squares cxre labeled "NL", and consistently contciin the true value, 
quite unlike the midranges for the linearizations. 

Table 2 For Parameter estimates a, for a = 25 
Design (x) = (.1, .1, .1, .2, . . . , 1), a = 0.2 

Linear I Linear II Nonlinear 
range: 15.508 21.182 34.8 
Cor(a, b) : .99997 .9989 .99435 
mean: 6.467 17.885 25.559 

To get a better assessment of the situation, we look at the estimated co- 
relations in Table 2. We use the following mathematization of co-relation, 
for estimators 0^, d^, with parameters 8^, $2 : 

Co - relation!*,, ^2) = " ^■)(^2 - ^2) 

To estimate the co-relation, we substitute means for expectations. The 
estimated co-relations are .994 and up, so the parameter estimates are 
highly co-related, making accurate estimation difficult. These co-relations 
and the nonelliptic squared error contour graphs of Colquhoun^ imply that 
finding the minimum using the maximum descent method can be difficult. 
Habibullah^ transforms such contours to approximately circular contours 
by transforming the parameters. He transformed concentric banana shaped 
contours, which apparently were not asymptotically elliptic, to roughly cir- 
cular coiiiours. In one of the problems, the reduction was from 13010 iter- 
ations for the usual maximum descent to 19 iterations after the transform, 
and from 9425 iterations to 41 in another. Eastham et al."^ found trans- 
formation of the parameters quite useful in finding maximum likelihood 
estimators for the three parameter lognormal distribution. We expect the 
trick of transforming the parameters to permit the use of numerical min- 



iiiiizing even when the two and more solute exten.ssions of the Langmuir's 
models do not permit the simplicity of using bisection as it was done here. 

In view of the regions in which the midrange does not include the true 
curve, and Table 2, we discard methods I and II. 

C. EVIDENCE FOR ASYiMPTOTIC NORMALITY: Next we look at non- 
1' lear regression with imiform design, in table 3. The mean goes to the true 
value 10 as n — > 00. The mean squared error of the estimator goes to zero 
at the proper rate of j^. Thus, when multiplied by y^, the mean squared 
error is as constant as can be expected from simulation. 



Table 3 Moments of b by simulation 
For 6 = 10, nonlinear 



size: 


10 


40 


160 


640 


mean: 


10.87 


10.18 


10.06 


10.02 


f^MSE : 


15.71 


10.71 


10.61 


11.05 


skew: 


1.55 


.57 


.31 


.17 


kurt: 


7.84 


3.56 


3.32 


3.32 



Our skewness is the mean of cubed error from the true value of the 
parameter, and kurtosis is the mean of error to che fourth power. We make 
the skewness and kurtosis dimensionless, independent of scale by dividing 
by the mean squared error to the powers | and 2, respectively. We show 
these moments (estimated from 2000 simulations) in table 3. 

The skewness is going to zero, and the kurtosis to three, which is the 
value of the kurtosis for the standard normal distribution. This implies 
rapid convergence of the distribution of parameter estimates to the normal 
distribution. 

So, one can expect the asymptotic normal approximation to the finite 
sample distribution of (a. 6) to be a good approximation. The asymptotic 
formula is derived in Chapter III. 



11 
10 



III. Analytic Results 

A. ASYMPTOTIC PARAMETER ESTIMATE VARIANCES: Define (a, 6) 
as the constant, true values of the parameters, (a*, 6*) as variables, and 
(a, 6) as parameter estimates. Also, y,- = ax,/(l + 6x,) + €,-. The control 
variable is x, and the response variable is y. 
We recall our sum of squares, 

«-5("-rm;) 

0 so 



da* 
0 = 




We assume that we will use reasonable sequences of designs of ex- 
periment. We assume that for reasonable designs of experiment, a and b 
are consistent estimators of a and 6. We take a large enough sample that 
Aa = a — a, and A6 = 6 — 6 are infinitesimals with high probability. Define: 




The model is linear in a, so the functions are already expanded in Aa. 
We expand by Taylor series in A6 only, about zero. 



Aa; A6) 

- Aa; 0) + A6^(e; Aa; 0) 

n 



\ 3 
X.- \ 



1=1 

correct to infinitesimal of the first order. 

Define = 1 YlU /^f ' similarly for i}!^ = i J2=i ^if^i- 

i/i(e; Aa; A6) = Tila-^ - Aajfia-"^ - ^b^a'^ + AbJPa-^ , 

2 

Next, we multiply by 2- to obtain 

0 ~ ae/z " Aa//2 + A6/I3 - A6€/i2. 
Now we consider the term €fi^, in the A6 term. Since the € 's are i.i.d 



Var(6//2) ^ Var^i^6,/.?j 



! = 1 

= -a fi-^ 
n 



12 



1 ''' 



So, f/z ~ -V(0. y^a'-i-i'^), and the term -A6e/z- is an infinitesimal of 
higher order than A6, Hence, ignoring the -Ahefi'^ term in comparison 
with Aa and Ab : 

a(e//)~[.a^ ~fi^] 
Next, correct to infinitesimal of the first order. 




f,{e; Aa; A6) = /.(f; Aa; 0) + A6|^(e; Aa; 0) 

= ^ififa-^ + //,//?a""^ - (a + Aa)/zja 
j=i 



-3 



+ ^ -2A6e./zJa-^ - 2//iA6//Ja-' + 3A6/zJa 
«=i 



,3^-3 



4^-3 



,4^-3 



We multiply by ^a^ to obtain 



0 = a(e//2) - Aa//3 - 2A6(e//3) + A6//4 

Similarly, we drop the Ahfi^ term, yielding Var(e/(^) 
Now we discard the A6e//^ term, leaving 



Aa 
A6 



H'^nce, 



Aa 

Ab 



=.- M 



Aa 
A6 



and 



Aa" 
A6 



~ aM 



-1 



IIL 
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So rho covariaiice can be found 



as 



|Ccv(a, 6) 



Var(€/i) Cov(c/i, e/i^) 
Cov(e/z, i^) Var(e^) 



M 



-IT 



where 



The need for M to be nonsingular makes it necessary that Jl^-J^-]?' 
IS not 0. A sufficient condition to make M nonsingular is that there be two 
different and nonzero values of x in the design, shown in other writings. 
Next, recalhng the independence of the e's, 



Cov(e/i, €H' ) 



So 



Cov(a, 6) 





/i3- 




>2 


/i3- 

















9 O 



2 ■> 



r-i 0 
[o 1 



//3 ^2 J 



-/i^ -/i3 1 



It is consoling to note that this matches with the formula in Olivci-^ 
up to a clerical error in 01iver^ The error can be removed by substituting 



14 

^5 



the definitions 



^ /J* 3 ^ 



{K + xY ' 



for the definitions 



5 = E 



Next, we check for concordance or discordance between the asymptotic 
results and simulation results. We standardize the mean squared errors by 
I iltiplying by the sample size, so the variances will not go to zero as n 
goes to infinity. For the design (x) = (.1, .2, .3, 1), and replicates 
of this design, with parameters a = 25, 6 = 10, and a — .2, the matrix of 
standardized covariances is as follows. 



nVar(d) nCov(a, 6) 
nCov(a, h) nVar(6) 



488.699 232.302 
232.302 111.829 



Table 4 Comparison of Finite and Asymptotic Moments 

Simulation Asymptotic 
n 10 40 160 320 [Cov(a, 6)] 

nMSE(a) 605.4 508.5 502.2 497.3 488.7 

nCov(a, 6) 325.8 250.1 237.4 235.5 232.3 
nMSE(6) 141.8 117.6 115.2 114.2 111.8 

We compare the standardized mean squared errors and covariances 
obtained from asymptotics with coefficients from simulation, using the same 
designs. The /I^'s and variance-covariance matrix above are the same for 
any number of replications of the same design. 

With increasing sample size, the simulation based estimates approach 
the asymptotic based approximation from above. 

o 15 
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B. OPTLMALITY CRITERION AND ALGORITHM: Next, we define the 
standardized, generalized variance as the determinant of the matrix of stan- 
dardized covariances. We define the optimal design as the design minimizing 
the standardized, generalized variance. We use the asymptotic formulae as 
approximations to the true values. So the standardized, generaUzed vari- 
ance is asymptotically: 



SGVar(a, 6) = 



nVar(a)^ nCov(a, 6) 
nCov(a, b) nVar(6) 



• //•* - fJL^' 



^ ^Ij^i^zing this quantity is equivalent to maximizing its denominator 
divided by with respect to x^-. 



G* = 
Define 



+ 6x. 



n 



I/. = — <x> 2i 



l+6x,' 



" / X \ 



so 1/ takes exactly the range [0, 1]. 

Maximizing G' with respect to (x) is equivalent to maximizing with 
respect to the vector (j/) : ^ 



G' = 



1=1 

Note that this G is not a function of a 



n 



n 



n 



n 2 



i=l 



nor (7-. Hence the optimal design 



is the same for all a and a\ Given the experimenter should recognize 
an approximate value of b and get (xj by solving ... = i±^^^ So 
we work with these u- y a \n il f,^ c^i, i / . i+6x,- 

uiese u^, 6 [U, IJ to solve the problem n generalitv and 
transform to real units when finished. ^ 
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If we iiiultiply each by some constant A: > 1, then the new value of 
G is the old value multipUed by k^. Since each x- is in [0, ooy, each i/,- is 
between zero and one, and the optimal design contains at least one i/^ = 1. 
We set 1/^ = 1. 

Our £jgorithm is as follows. For any n — 1 values, we find the optim£j 
nth value to add. When we have enough observations, we find the best 
value of Uo based on the other u'-s. We continue for u-, i = 2, 3, . . . , n, 
and repeat until the algorithm converges. 

We define 

S(l,m)= ^ "!= (E + ( E 

ie{l, n}^m \ 1=1 / \i=m + l / 

To take parti£Ll derivatives with respect to i^^, we express G as 

G = [5(2, m) + ui] [5(4, m) + uf,,] - [5(3, m) + uif 
= 5(2, m)5(4, m) - 5^(3, in) 
+ 5(2, m)ul + S{4, m)ul-2S{3, m)ul 



dG 

— = 0 = 45(2, my^-eS{3, m)ul + 2S{i, m)u^ 



m 



= [25(2, m)ui - 35(3, m)u^ + 5(4, m)] 

with solutions 

35(3, m)± [952(3, m)-85(2, m)5(4, m)]^ 

V =0 = — 

' 45(2, m) 

The double partial derivative should be negative. 

^ = 125(2, m)vl - 125(3, m)t/^ + 25(4, m). 
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For 1/^ = 0, the double partial is 25(4, m) > 0 so we can discard 
I'rn = 0. For the other two roots, define the discriminant as 

c? = 95^(3, m)-85(2, m)5(4, m). 



- ^|^t35(3. '"j ± rf'l + 25(4, m). 



Next, 



25(2, m)^= 95^(3, m) - 85(2, m)5(4, m) ± 35(3, m)d^. 

The smaller double derivative could be negative or zero, if the discrim- 
inant is zero: 

d^G _ c?±35(3, 7n)d2 
dul - 25(2, m) • 

The interest in this value of is not based on knowing that it is a 
relative maximum, but on knowing that it is the only interior point which 
could be a relative maximum. 

Next, we used APL (A Programming Language, source code available 
on request,) to compute the sequence of conditionally optimal observations. 
Table 5 contains the results. The code was set up to choose i/j = .5 when 
there was a choice, that is, when the value of G is the same for either choice. 
All aspects of this table inspire the lemma to follow. Firstly, after the initi<il 
three, the designs added are alternating between .5 and 1. There is a choice 
for odd sample sizes, but no choice otherwise. 
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l^ble 5 

Table of Steps in the Optimal Design Algorithm 

Values of G 



Observation 


1/ = .5 


1/ = 1 


u 


1 






1 


2 


.0625 


0 


.5 


3 


.125 


.125 


.5 


4 


.1875 


.25 


1 


5 


.375 


.375 


.5 


6 


.5 


.5625 


1 


7 


.75 


.75 


.5 


8 


.9375 


1 


1 


9 


1.25 


1.25 


.5 


10 


1.5 


1.5625 


1 


11 


1.875 


1.875 


.5 


12 


2.1875 


2.25 


1 


13 


2 625 


2 625 


5 


14 


3 


3.0625 


1 


15 


3.5 


3.5 


.5 


16 


3.9375 


4 


1 


17 


4.5 


4.5 


.5 


18 


5 


5.0625 


1 


19 


5.625 


5.625 


.5 


20 


6.1875 


6.25 


1 



As a check on our proof of the lemma, we note that when there is a 
difference in Table 5, the difference is .0625 = 1/16, the same as in Cases 1 
and 3 of the proof of Lemma 1. 



C. LOCAL OPTIMALITY OF THE DESIGNS: 

Lemma. Let us choose designs for Langmuir's model. Given a sample, 
we add an x which is optimal. We set the first x equal to x^ without loss 
of generality. For even sample sizes, the result is half at and half 
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at x^. For odd sample sizes we take {n±l)/2 observation at ott?=— and 
{n^l)/2 at x^. 

Rirthermore, this design cannot be improved on by replacing an ob- 
servation with a best value conditioned on the remaining observations. 

Proof of the Lemma 

For ease, we work with showing = i to yield x = , and 

= 1 to yield x = x^. From the previous argument that at least°one i/,. 
must equal one, we set i^j = 1. We add the nth observation. 

Suppose the previous design had m of i^,. = and n-m-lofi/=l. 
First we compute the 5(/, n)'s. When is the smaller root |? 



UJ +("~^~^)^ tT^ — 

471 — 3m - 4 
4 

8n - 7m - S 
8 

16n - 15m - 16 
16 ' 

and 

_ 35(3, n)-|[35(3, n)]'-85(2, n)5(4, n)}' 

45(2, n) • 

The discriminant d is 



5(/, n) = 

5(2, n) = 

5(3, n) = 

5(4, n) = 



64n2 + Slm2 + 64 - 144n7n - 128n + 144?n 



d = 



64 



Sn - 9m - 8' 
8 



20 



21 



4 



This leads to 

_ 24n - 21m - 24 - |8n - 9m - 8| 
~ 32n - 24m - 32 



{ 



i if 8n - 9m - 8 > 0 

32n-30m-32 nfViPrwkp 

32„-24m-32 otnerwisc 



This means the potential relative maximum is at i^^ = | if eight or less 
I/,- = ^'s axe present per nine i^,'s in the previous design. 

Next, we compute and maximize G for m of ^' = ^ and n — m of = 1. 



and 1^ = -^^^1^, so m = I is ideal with m = being best if n is odd, 
since m must be an integer. Thus the first part of the Lemma generates 
a sequence of designs with m = |- for n even, and m = for n odd, 
satisfying the conditions for ^ to be the second zero in each case. The 
second part of the Lemma similarly holds, the most extreme cases being 
even n after deletion and m = which again satisfies the condition 
< |, using n instead of n — 1 since the n in our formula for G is 7i — 1 
in the formula for the partial derivative. So the lemma is true. 



IV. Simulation Based Results 2 

We compare four designs. The first is (1/n, . . . n/n), called Arithmetic 
1. The second is (3/n, 6/n, ... 3 n/n), called Arithmetic 3. The maximum 
value = 3 is chosen from talks with Anita Katti. The third design is 
our optimal allocation, with half the observations at = 3 and half at 
oTk^ = .09375. The fourth design, called Logarithmic, is uniform on the 
log scale with maximum value 3 and no replication. 

21 
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Table 6 


Means of a, with a = 


25. 


Sample Size: 


10 


40 


160 


640 


Arithmetic 1: 


26.81 


25.37 


25.13 


25.03 


Arithmetic 3: 


36.17 


25.67 


25.21 


25.04 


Logaiithmic: 


33.47 


26.80 


25.49 


25.12 


Optimal: 


25.41 


25.14 


25.05 


25.00 



The least value of x in the Logarithmic design is computed to minimize 
the generalized (asymptotic) variance, yielding Xj = 0.327943, 0.363068, 
0.370807, and 0.372685 for sample sizes 10, 40, 160, and 640 respectively. 
The least value in the Logarithmic design is easy enough to compute since 
there is only one quantity to vary, namely the smallest x to be used. 

Five samples out of 2000 in the Logarithmic design yielded poor results, 
namely least squares estimates of 6 in excess of 2000 with a true value of 
10, and their estimates, 6, were set equal to zero. Similarly, five samples 
out of 2000 in the Arithmetic 3 design yielded poor results. Each of these 
five samples are included in the statistics for the logarithmic design and 
arithmetic designs respectively. 

Note that the means for Arithmetic 1 and Optimal designs are fairly 
close to the true value of 25. For the (estimated) standard deviations, the 
optimal design is the best, by a factor of \/3 or better. 

Table 7 Standard Deviations of d, with a = 25. 



Sample Size: 


10 


40 


160 


640 


Arithmetic 1: 


8.46 


3.43 


1.69 


.86 


Arithmetic 3: 


98.35 


4.71 


2.19 


1.11 


Logarithmic: 


47.22 


8.32 


3.46 


1.68 


Optimal: 


3.99 


1.97 


.97 


.47 



Next, we look at graphs of the .90 midranges for the three designs. To 
make the comparisons easier, we subtract the true mean from the Y's. These 
midranges are obtained by computing at each value of x, the 2000 values 
of y, rank them and graph the Y - £{Y) at the .05 level, y - S{Y), and 
graph the Y - S{Y') at the .95 level. The boundary curves of the midranges 
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cross each other repeatedly, so there is no great disadvantage in one design 
over another, aside from logarithmic uniform being a bit off for x around 
0.2. 

.90 Midranges from Different Designs 




A. Sample Size n=10 B. Sample Size n=40 

Fig. 3. Key: Arithmetic Uniform 3: short dashes; Optimal: 
thick solid; Logarithmic Uniform: long dashes. 



As we increase the sample size, the midrauge curves become consider- 
ably closer to the true curve and to each other. 

We kept track of the ten samples from the two allocations which led 
to unacceptable estimates of 6, that is estimates in excess of 2000 with a 
true value of 10. The samples are available on request. It appears that the 
logarithmic and "arithmetic 3" designs place so few observations near the 
origin and near the maximum value that the estimation is unstable. 

In summary, there does not seem to be any serious loss cissociated with 
the use of our locally optimal design. To the contrary, our design might 
reduce the sample size needed by a factor of three or better. 
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V. Regarding Global Optimality 

We check whether we can improve on our locally optimal solution by 
changing all of the observations simultaneously. Let m = n/2 or m = 
(w ± l)/2 as appropriate. We denote by <f • the changes to the first m values 
of z/. and by -/., the changes to the second half. The ranges of d- are 
[-|, |] and of /,• are [0, l]. The expression for G becomes: 



1=1 ^ ^ i=m+l 

"I X ^ V 3 n 

eG^''.) + E 

Li=l ^ ^ i=m+l 



Next, we define 



m 



^it = E^'' E A: = l, 2, 3, 4. 



1=1 



i=mf 1 



We rewrite G as: 



4n - 3m 



16n - 15m D. ZDo 

16 + T + ^^^3 + ^4-4^1+6^2 + F 



Sri -7m ZD, 3Do 



The interesting part is that regardless of the value of n, the problem of 
optimal design reduces to these eight variables, D = {Df^; F^.). We expanded 
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the polynomial laanually and by Mathematica. Since the polynomial is a 
quadratic form in eight variables, we write it in the matrix format as follows. 



m(n — m) 
16 



+ D 



0 

'^Sn+dm 
16 
m 
4 

4n-3m 
4 

8 

16n-3m 
16 

"Sn+Sm 
4 

4m— 3m 



r 1 


1 


1 


1 


1 


1 


5 


1 -J 


16 


8 


4 


2 


4 


4 


o 


1 


3 


1 


1 


1 


3 


1 


1 


8 


4 


2 


2 


4 


2 


2 


1 

4 


1 
2 


-1 


0 


1 


-2 


1 


0 


1 


1 

2 


0 


0 


-1 


1 

2 


0 


0 


4 


1 


1 


-1 


-1 


1 


1 


-1 


1 


3 
4 


-2 


i 

2 


1 


-3 


1 


1 

2 


5 

J 


1 


1 


0 


1 


1 


-1 


0 


2 


2 


0 


0 


-1 


1 

2 


0 


0 . 



D 



The eigenvalues E, of the 8x8 matrix and the matrix M of eigenvectors 



is: 



E = 



M = 



.000 


.000 


.000 


.000 


.000 


2.109 


3.266 


5.65 


-.160 


.654 


.421 


.044 


.066 


.473 


.324 


.18o- 


.589 


.051 


.012 


.121 


.663 


.046 


.382 


.222 


.344 


.312 


.245 


.402 


.489 


.324 


.088 


.460 


.435 


.466 


.566 


.177 


.170 


.369 


.247 


.124 


.024 


.288 


.383 


.027 


.434 


.461 


.516 


.320 


.477 


.129 


.333 


.131 


.270 


.114 


.121 


.726 


.253 


.373 


.428 


.239 


.034 


.414 


.582 


.212 


..168 


.125 


.036 


.846 


.164 


.369 


.247 


.124. 
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It is interesting that five of the eight eigenvalues are zero and only 
threa are nonzero. Thus, the optimal design does not depend on all eight 
variables independently. It depends on only three functions of them. 

The range of the three functions is not nice. We could not proceed 
any further along this line. So, we checked on the global optimality by 
computing the function G over a fine grid, of size 100^ over all of the eight 
\-ariables. D = [D^., Ff^]. With the above information, computation over 
such a large grid was not excessively time consuming. The computations 
confirmed that our design was globally optimal. 

This is our current evidence that the design is globally optimal if half 
the observations are taken at the maximal value and half where the u's 
equal half their maximal value. 



VI. Conclusion 

In conclusion, we suggest the use of whatever model is expected to 
have constant errors in the dependent variable. For the usual assumptions 
of normality and constant variance, we f irther suggest the design of half 
the observations to the maximum value, x = x^, and half to x = 2+fe~5 to 
maximize the accuracy of parameter estimation. It is interesting cha7 the 
statistical assumptions are as important as they seem,. It is also interesting 
that the optimal design seems to be as simple and extreme as the optimal 
desig: in simple linear regression. 
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